Without-Replacement Sampling for Stochastic Gradient Methods

Author

  • Ohad Shamir
Abstract

Stochastic gradient methods for machine learning and optimization problems are usually analyzed assuming data points are sampled with replacement. In contrast, sampling without replacement is far less understood, yet in practice it is very common, often easier to implement, and usually performs better. In this paper, we provide competitive convergence guarantees for without-replacement sampling under several scenarios, focusing on the natural regime of few passes over the data. Moreover, we describe a useful application of these results in the context of distributed optimization with randomly-partitioned data, yielding a nearly-optimal algorithm for regularized least squares (in terms of both communication complexity and runtime complexity) under broad parameter regimes. Our proof techniques combine ideas from stochastic optimization, adversarial online learning and transductive learning theory, and can potentially be applied to other stochastic optimization and learning problems.
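To make the contrast concrete, here is a minimal sketch (not the paper's algorithm or analysis) of the two sampling schemes on regularized least squares, the problem class the abstract highlights. With-replacement SGD draws an independent uniform index at every step; without-replacement SGD makes passes over fresh random permutations of the data (random reshuffling). All problem sizes, the regularization weight lam, and the step size lr are arbitrary illustrative choices.

```python
# Sketch: with-replacement vs. without-replacement SGD on ridge regression,
#   min_w (1/n) sum_i (x_i^T w - y_i)^2 + lam * ||w||^2.
import numpy as np

rng = np.random.default_rng(0)
n, d, lam, lr, epochs = 200, 10, 0.1, 0.01, 5  # illustrative values
X = rng.standard_normal((n, d))
y = X @ rng.standard_normal(d) + 0.1 * rng.standard_normal(n)

def sgd(sample_indices):
    """Run SGD, processing one data point per step in the given order."""
    w = np.zeros(d)
    for i in sample_indices:
        grad = 2 * (X[i] @ w - y[i]) * X[i] + 2 * lam * w
        w -= lr * grad
    return w

# With replacement: every step draws an index uniformly at random.
with_repl = sgd(rng.integers(0, n, size=epochs * n))

# Without replacement: each pass visits every point exactly once,
# in a fresh random order (random reshuffling).
order = np.concatenate([rng.permutation(n) for _ in range(epochs)])
without_repl = sgd(order)

def loss(w):
    return np.mean((X @ w - y) ** 2) + lam * w @ w

print(f"with replacement:    {loss(with_repl):.4f}")
print(f"without replacement: {loss(without_repl):.4f}")
```

On typical draws the reshuffled run reaches a comparable or slightly lower loss, consistent with the empirical observation the abstract mentions, though a single toy run proves nothing about the general guarantees.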

Similar articles

Without-Replacement Sampling for Stochastic Gradient Methods: Convergence Results and Application to Distributed Optimization

Stochastic gradient methods for machine learning and optimization problems are usually analyzed assuming data points are sampled with replacement. In practice, however, sampling without replacement is very common, easier to implement in many cases, and often performs better. In this paper, we provide competitive convergence guarantees for without-replacement sampling, under various scenarios, f...

Beneath the valley of the noncommutative arithmetic-geometric mean inequality: conjectures, case-studies, and consequences

Randomized algorithms that base iteration-level decisions on samples from some pool are ubiquitous in machine learning and optimization. Examples include stochastic gradient descent and randomized coordinate descent. This paper makes progress toward theoretically evaluating the difference in performance between sampling with and without replacement in such algorithms. Focusing on least mean squares...
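For a concrete view of the quantity such comparisons revolve around, the sketch below (a brute-force enumeration over a tiny instance; the pool size n, dimension d, product length k, and the random PSD matrices are all arbitrary choices, not from the paper) computes the spectral norm of the expected product of k sampled matrices under both schemes. The conjectured noncommutative arithmetic-geometric mean inequality says, roughly, that the without-replacement value should be no larger.

```python
# Sketch: ||E[A_{i_1} ... A_{i_k}]|| under with- vs. without-replacement
# sampling of indices, averaged exactly over all equally likely sequences.
import itertools
import numpy as np

rng = np.random.default_rng(1)
n, d, k = 4, 3, 2  # pool size, matrix dimension, product length

# Random symmetric positive semidefinite matrices A_1, ..., A_n.
mats = []
for _ in range(n):
    B = rng.standard_normal((d, d))
    mats.append(B @ B.T)

def expected_product(with_replacement):
    """Average the product A_{i_1} ... A_{i_k} over all index sequences."""
    if with_replacement:
        seqs = list(itertools.product(range(n), repeat=k))
    else:
        seqs = list(itertools.permutations(range(n), k))
    total = np.zeros((d, d))
    for seq in seqs:
        P = np.eye(d)
        for i in seq:
            P = P @ mats[i]
        total += P
    return total / len(seqs)

norm_with = np.linalg.norm(expected_product(True), 2)     # spectral norm
norm_without = np.linalg.norm(expected_product(False), 2)
print(f"with replacement:    {norm_with:.4f}")
print(f"without replacement: {norm_without:.4f}")
```

Exhaustive enumeration is only feasible for toy sizes like these; it is meant to illustrate the compared quantity, not to test the conjecture at scale.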

Toward a Noncommutative Arithmetic-geometric Mean Inequality: Conjectures, Case-studies, and Consequences

Randomized algorithms that base iteration-level decisions on samples from some pool are ubiquitous in machine learning and optimization. Examples include stochastic gradient descent and randomized coordinate descent. This paper makes progress toward theoretically evaluating the difference in performance between sampling with and without replacement in such algorithms. Focusing on least mean squares...

SGD Algorithms based on Incomplete U-statistics: Large-Scale Minimization of Empirical Risk

In many learning problems, ranging from clustering to ranking through metric learning, empirical estimates of the risk functional consist of an average over tuples (e.g., pairs or triplets) of observations, rather than over individual observations. In this paper, we focus on how best to implement a stochastic approximation approach to solve such risk minimization problems. We argue that in the ...
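For intuition, here is a toy sketch (my own simplified setup, not the paper's method) of SGD where each gradient estimate is an incomplete U-statistic: rather than averaging a pairwise loss over all n(n-1)/2 pairs, each step samples a small batch of B index pairs. The pairwise loss, batch size, and step size are illustrative inventions.

```python
# Sketch: SGD with incomplete-U-statistic gradients for a pairwise risk.
import numpy as np

rng = np.random.default_rng(2)
n, d, lr, steps, B = 100, 5, 0.05, 500, 10  # illustrative values
X = rng.standard_normal((n, d))
y = rng.integers(0, 2, size=n)  # binary labels

def pair_grad(w, i, j):
    """Gradient of a toy pairwise squared loss: equal-label points should
    get similar scores x^T w, unequal-label points a score gap of 1."""
    diff = X[i] @ w - X[j] @ w
    target = 0.0 if y[i] == y[j] else 1.0
    return 2 * (diff - target) * (X[i] - X[j])

w = np.zeros(d)
for _ in range(steps):
    # Incomplete U-statistic: average over B pairs drawn uniformly at
    # random, instead of over all n*(n-1)/2 pairs.
    g = np.zeros(d)
    for _ in range(B):
        i, j = rng.choice(n, size=2, replace=False)
        g += pair_grad(w, i, j)
    w -= lr * g / B

# Rough check: estimate the pairwise risk on freshly sampled pairs.
pairs = [rng.choice(n, size=2, replace=False) for _ in range(1000)]
risk = np.mean([(X[i] @ w - X[j] @ w - (0.0 if y[i] == y[j] else 1.0)) ** 2
                for i, j in pairs])
print(f"estimated pairwise risk: {risk:.4f}")
```

The point of the construction is the cost trade-off: each step touches only B pairs, while the full U-statistic gradient would cost O(n^2) per step.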

Sample Complexity of Stochastic Variance-Reduced Cubic Regularization for Nonconvex Optimization

The popular cubic regularization (CR) method converges with first- and second-order optimality guarantees for nonconvex optimization, but suffers from high sample complexity on large-scale problems. Various sub-sampling variants of CR have been proposed to improve the sample complexity. In this paper, we propose a stochastic variance-reduced cubic-regularized (SVRC) Newton's method ...
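For orientation, the sketch below (a single simplified step, not the SVRC algorithm) shows the basic cubic-regularization update with a subsampled Hessian on a least-squares objective: the step s approximately minimizes g^T s + (1/2) s^T H s + (sigma/3)||s||^3, here via plain gradient descent on the subproblem. The problem, batch size, sigma, and inner-solver settings are all illustrative assumptions.

```python
# Sketch: one cubic-regularized Newton step with a subsampled Hessian.
import numpy as np

rng = np.random.default_rng(3)
n, d, sigma = 500, 8, 1.0  # illustrative values
X = rng.standard_normal((n, d))
y = rng.standard_normal(n)

def full_grad(w):
    """Gradient of the least-squares loss (1/n) ||Xw - y||^2."""
    return 2 * X.T @ (X @ w - y) / n

def subsampled_hessian(batch):
    """Hessian estimated from a random sample of rows."""
    Xb = X[batch]
    return 2 * Xb.T @ Xb / len(batch)

def cubic_step(g, H, sigma, iters=300, eta=0.1):
    """Approximately solve min_s g^T s + 0.5 s^T H s + (sigma/3)||s||^3
    by gradient descent; the cubic term's gradient is sigma * ||s|| * s."""
    s = np.zeros_like(g)
    for _ in range(iters):
        grad_m = g + H @ s + sigma * np.linalg.norm(s) * s
        s -= eta * grad_m
    return s

w = np.zeros(d)
batch = rng.choice(n, size=50, replace=False)
s = cubic_step(full_grad(w), subsampled_hessian(batch), sigma)
print("grad norm before:", np.linalg.norm(full_grad(w)))
print("grad norm after: ", np.linalg.norm(full_grad(w + s)))
```

In full CR methods the cubic subproblem is solved by specialized routines and sigma is adapted across iterations; gradient descent on the subproblem is used here only to keep the sketch self-contained.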

Publication date: 2016